Packages
pkgs <- "
kthcorpus DT bslib htmltools dplyr downloadthis
"
import <- function(x)
x |> trimws() |> strsplit("\\s+") |> unlist() |>
lapply(function(x) library(x, character.only = TRUE)) |>
invisible()
pkgs |> import()
The Scopus APIs for publication search and extended abstract data can be used to retrieve metadata for Scopus publications.
Scopus data for KTH can be retrieved from the Scopus APIs. This assumes that the environment variables SCOPUS_API_KEY and SCOPUS_API_INSTTOKEN are available. These need to be present in the ~/.Renviron file. Requests count towards a rate limit quota, which can be checked with scopus_ratelimit_quota().
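Before calling the API it can help to confirm that both environment variables are actually set. The following is a minimal sketch in base R; the helper name missing_env_vars() is hypothetical and not part of kthcorpus.

```r
# Hypothetical helper (not part of kthcorpus): report which of the
# given environment variables are unset or empty.
missing_env_vars <- function(vars) {
  # Sys.getenv() returns "" for variables that are not set
  values <- Sys.getenv(vars)
  vars[!nzchar(values)]
}

not_set <- missing_env_vars(c("SCOPUS_API_KEY", "SCOPUS_API_INSTTOKEN"))

if (length(not_set) > 0) {
  message(
    "Missing environment variables: ", paste(not_set, collapse = ", "),
    ". Add them to ~/.Renviron and restart R."
  )
} else {
  message("Scopus credentials found.")
}
```

Changes to ~/.Renviron only take effect in a new R session, which is why the message suggests a restart.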
scopus <- scopus_search_pubs_kth()
scopus_ratelimit_quota()
Because of the quota limit, and since a scheduled job already provides the latest data, a better approach is to request the data from object storage.
scopus <- scopus_from_minio()
Given a specific Scopus identifier for a publication, we can use a function to retrieve additional information, including, for example, raw affiliation strings.
# use the first id
sid <- scopus$publications$`dc:identifier` |> head(1)
scopus_abstract_extended(sid)
$scopus_abstract
# A tibble: 1 × 20
`dc:publisher` srctype prism…¹ prism…² sourc…³ cited…⁴ prism…⁵ subtype opena…⁶
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 MDPI j 2023-0… Journal 211002… 0 15 ar 1
# … with 11 more variables: `prism:issn` <chr>, `prism:issueIdentifier` <chr>,
# subtypeDescription <chr>, `prism:publicationName` <chr>,
# openaccessFlag <chr>, `prism:doi` <chr>, `dc:identifier` <chr>, lang <chr>,
# keywords <chr>, sid <chr>, `dc:description` <chr>, and abbreviated variable
# names ¹`prism:coverDate`, ²`prism:aggregationType`, ³`source-id`,
# ⁴`citedby-count`, ⁵`prism:volume`, ⁶openaccess
$scopus_authorgroup
# A tibble: 3 × 27
sid id i ce_gi…¹ prefe…² prefe…³ prefe…⁴ prefe…⁵ seq ce_in…⁶ fa
<chr> <chr> <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 SCOPU… 1 1 Amir E… Amir E… A.E. Forouh… Forouh… 1 A.E. true
2 SCOPU… 2 1 Shahrz… Shahrz… S. Khosra… Khosra… 2 S. true
3 SCOPU… 3 1 Jafar Jafar J. Mahmou… Mahmou… 3 J. true
# … with 16 more variables: type <chr>, ce_surname <chr>, auid <chr>,
# orcid <chr>, ce_indexed_name <chr>, country <chr>, address_part <chr>,
# afid <chr>, country3 <chr>, city <chr>, organization <chr>,
# affiliation_id <chr>, ce_source_text <chr>, dptid <chr>, postal_code <chr>,
# raw_org <chr>, and abbreviated variable names ¹ce_given_name,
# ²preferred_name_ce_given_name, ³preferred_name_ce_initials,
# ⁴preferred_name_ce_surname, ⁵preferred_name_ce_indexed_name, ⁶ce_initials
$scopus_correspondence
# A tibble: 1 × 5
sid ce_given_name ce_initials ce_surname ce_indexed_name
<chr> <chr> <chr> <chr> <chr>
1 SCOPUS_ID:85147885260 Amir Esmael A.E. Forouhid Forouhid A.E.
To automatically look up known KTH identifiers for researchers (kthids) from ORCiDs, the known associations can be fetched up front.
This step is not required, since the identifiers are otherwise looked up article by article, but it can speed up the process.
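To illustrate why an up-front table of associations avoids repeated lookups, here is a minimal sketch in base R. The column names orcid and kthid, the helper lookup_kthid(), and all values are made up for illustration; the actual structure returned by kthid_orcid() may differ.

```r
# Hypothetical table of known pairs (values and column names are
# assumptions, not the real output of kthid_orcid())
known <- data.frame(
  orcid = c("0000-0000-0000-0001", "0000-0000-0000-0002"),
  kthid = c("u1aaaaaa", "u1bbbbbb"),
  stringsAsFactors = FALSE
)

# Resolve many ORCiDs in one vectorised step; unknown ORCiDs yield NA
lookup_kthid <- function(orcids, known) {
  known$kthid[match(orcids, known$orcid)]
}

lookup_kthid(c("0000-0000-0000-0002", "0000-0000-0000-0009"), known)
```

With the table in memory, each article only needs a local match() rather than a separate request.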
ko <- kthid_orcid()
Different publication types require slightly different kinds of MODS file content.
To work with Scopus articles, filter on the publication subtype, like so:
# subtype == "cp" # conference paper
# subtype == "ar" # article
# subtype == "ch" # book chapter
articles <- scopus$publications |> filter(subtype == "ar")
To generate MODS for a specific article, we first need its Scopus identifier:
sids <- articles$`dc:identifier`
sid <- sids |> head(1)
# we provide previous scopus search results and kthid_orcid pairs
# to avoid runtime lookups for this data
mods <- sid |> scopus_mods(scopus = scopus, ko = ko)
mods |> cat()
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-2.xsd">
<mods xmlns="http://www.loc.gov/mods/v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" version="3.7" xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-7.xsd">
<genre authority="diva" type="contentTypeCode">referee</genre>
<genre authority="diva" type="publicationTypeCode">article</genre>
<genre authority="svep" type="publicationType">art</genre>
<genre authority="diva" type="publicationType" lang="eng">Article in journal</genre>
<genre authority="kev" type="publicationType" lang="eng">article</genre>
<name type="personal" authority="kth" href="NA">
<namePart type="family">Forouhid</namePart>
<namePart type="given">Amir Esmael</namePart>
<role>
<roleTerm type="code" authority="marcrelator">aut</roleTerm>
</role>
<affiliation>Department of Civil Engineering, Parand Branch, Islamic Azad University, P.O. Box 3761396361, Parand, Iran</affiliation>
<description>org.orcid=0000-0002-0419-2674</description>
</name>
<name type="personal" authority="kth" href="NA">
<namePart type="family">Khosravi</namePart>
<namePart type="given">Shahrzad</namePart>
<role>
<roleTerm type="code" authority="marcrelator">aut</roleTerm>
</role>
<affiliation>Department of Industrial Engineering, North Branch, Islamic Azad University, P.O. Box 1496969191, Tehran, Iran</affiliation>
</name>
<name type="personal" authority="kth" href="NA">
<namePart type="family">Mahmoudi</namePart>
<namePart type="given">Jafar</namePart>
<role>
<roleTerm type="code" authority="marcrelator">aut</roleTerm>
</role>
<affiliation>Department of Sustainable Production Development, School of Industrial Engineering and Management, KTH, 100 44 Stockholm, Sweden</affiliation>
</name>
<titleInfo lang="eng">
<title>Noise Pollution Analysis Using Geographic Information System, Agglomerative Hierarchical Clustering and Principal Component Analysis in Urban Sustainability (Case Study: Tehran)</title>
</titleInfo>
<originInfo>
<publisher>MDPI</publisher>
<dateIssued>2023</dateIssued>
<dateOther type="availableFrom">February 2023</dateOther>
</originInfo>
<physicalDescription>
<form authority="marcform">print</form>
</physicalDescription>
<identifier type="doi">10.3390/su15032112</identifier>
<identifier type="scopus">2-s2.0-85147885260</identifier>
<identifier type="eissn">20711050</identifier>
<identifier type="issn">NA</identifier>
<identifier type="articleId">2112</identifier>
<typeOfResource>text</typeOfResource>
<location>
<url>https://api.elsevier.com/content/abstract/scopus_id/85147885260</url>
</location>
<subject lang="eng">
<topic>noise contours maps</topic>
<topic>noise pollution</topic>
<topic>regression</topic>
<topic>traffic parameters</topic>
<topic>urban noise measurements</topic>
</subject>
<subject lang="eng" authority="hsv" xlink:href="20105">
<topic>Transport Systems and Logistics</topic>
</subject>
<subject lang="swe" authority="hsv" xlink:href="20105">
<topic>Transportteknik och logistik</topic>
</subject>
<subject lang="eng" authority="hsv" xlink:href="20199">
<topic>Other Civil Engineering</topic>
</subject>
<subject lang="swe" authority="hsv" xlink:href="20199">
<topic>Annan samhällsbyggnadsteknik</topic>
</subject>
<abstract lang="eng">In this study, a new approach has been used with SPSS and MATLAB analysis to study urban road traffic noise distribution mapping, to obtain the representative road traffic noise maps. The observation has been performed at a high traffic highway. The factors influencing noise level (traffic, road width, slope, and residential or administrative–commercial land) use were surveyed and recorded for each point and their local and time dependencies were computed. According to the analysis, the maximum value of goodness of fit index for the traffic and noise level relationship was 0.64, followed by 0.489 for the percentage of residential land use. The result of this study showed that the vehicle speed, width of the road, and the land use can affect different sound levels emitted by moving vehicles on road. The model predicts that by increasing one vehicle per hour, an increase in noise level by 0.002 dB will happen.</abstract>
<note>Imported from Scopus. VERIFY.</note>
<relatedItem type="host">
<titleInfo>
<title>Sustainability (Switzerland)</title>
</titleInfo>
<identifier type="issn">NA</identifier>
<part>
<detail type="volume">
<number>15</number>
</detail>
<detail type="issue">
<number>3</number>
</detail>
<extent>
<start>NA</start>
<end>NA</end>
</extent>
</part>
</relatedItem>
<!-- <note type="funder">@Funder@ [@project_number_from_funder@]</note> -->
</mods>
</modsCollection>
A vectorised variant, scopus_mods_crawl(), iterates over several Scopus identifiers:
my_sids <- sids |> head(10)
my_mods <- my_sids |> scopus_mods_crawl(scopus = scopus, ko = ko)
Generating MODS parameters for 10 identifiers...
Generating MODS based on parameters...
Returning 9 MODS, failed attempts for: SCOPUS_ID:85147857562
names(my_mods)
[1] "SCOPUS_ID:85147889326" "SCOPUS_ID:85147844696" "SCOPUS_ID:85147854514"
[4] "SCOPUS_ID:85147847843" "SCOPUS_ID:85147857562" "SCOPUS_ID:85147855964"
[7] "SCOPUS_ID:85147840721" "SCOPUS_ID:85147829983" "SCOPUS_ID:85147826149"
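The log output above reports failed attempts. One way to identify them programmatically is sketched below in base R. It assumes the result is a list named by identifier, and that a usable entry is a single non-empty string; both are assumptions about the return value. The helper failed_ids() and the toy data are hypothetical.

```r
# Hypothetical stand-ins for my_sids and my_mods
requested <- c("SCOPUS_ID:1", "SCOPUS_ID:2", "SCOPUS_ID:3")
result <- list(
  "SCOPUS_ID:1" = "<mods>...</mods>",
  "SCOPUS_ID:2" = NA_character_
)

# Identifiers that were requested but have no usable MODS in the result
failed_ids <- function(requested, result) {
  ok <- vapply(
    result,
    function(x) is.character(x) && length(x) == 1 && !is.na(x) && nzchar(x),
    logical(1)
  )
  setdiff(requested, names(result)[ok])
}

failed_ids(requested, result)
```

The identifiers returned this way could be retried later or reviewed by hand.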
my_mods$`SCOPUS_ID:85147171092` |> cat()
A zip file with the results can be generated and included for download in a Quarto document.
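A minimal sketch of that step in base R follows, assuming the MODS results form a named list of XML strings. The helper write_mods_files() and the toy input are hypothetical and not part of kthcorpus.

```r
# Hypothetical helper: write each MODS string to its own .xml file
write_mods_files <- function(mods_list, dir = tempfile("mods_")) {
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  # keep only entries that are single, non-missing strings
  ok <- vapply(
    mods_list,
    function(x) is.character(x) && length(x) == 1 && !is.na(x),
    logical(1)
  )
  mods_list <- mods_list[ok]
  if (length(mods_list) == 0) return(character(0))
  # identifiers such as "SCOPUS_ID:123" contain a colon, so sanitise names
  safe <- gsub("[^A-Za-z0-9_-]", "_", names(mods_list))
  files <- file.path(dir, paste0(safe, ".xml"))
  for (i in seq_along(mods_list)) {
    writeLines(mods_list[[i]], files[i], useBytes = TRUE)
  }
  files
}

# Toy input standing in for my_mods
files <- write_mods_files(
  list("SCOPUS_ID:1" = "<mods/>", "SCOPUS_ID:2" = "<mods/>")
)

# utils::zip() relies on an external zip program, so only try it if present
if (nzchar(Sys.which("zip"))) {
  zipfile <- tempfile(fileext = ".zip")
  utils::zip(zipfile, files, flags = "-j -q") # -j: drop paths, -q: quiet
}
```

The resulting zip file could then be offered for download from a Quarto document, for instance through the downloadthis package loaded at the top; the exact call is left out here because it depends on how the document is set up.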